Preconditioning for Hessian-Free Optimization

Authors

  • Robert Seidl
  • Thomas Huckle
  • Michael Bader
Abstract

Recently, Martens adapted the Hessian-free optimization method to the training of deep neural networks. A key aspect of this approach is that the Hessian is never computed explicitly; instead, the conjugate gradient (CG) algorithm computes the new search direction using only matrix-vector products of the Hessian with arbitrary vectors, which can be evaluated efficiently by a variant of the backpropagation algorithm. Recent algorithms use diagonal preconditioners to reduce the number of CG iterations; they are chosen because they are easy to compute and apply. Unfortunately, in the later stages of the optimization these diagonal preconditioners are not as well suited to the inner iteration as they are in the earlier stages, mostly because near an optimum an increasing number of entries of the dense Hessian have the same order of magnitude. We construct a sparse approximate inverse (SPAI) preconditioner to accelerate the inner iteration, especially in the later stages of the optimization. The quality of our preconditioner depends on a predefined sparsity pattern. We exploit the known pattern of the Gauss-Newton approximation of the Hessian to construct the required pattern for our preconditioner efficiently, which can then be computed fully in parallel on GPUs. This preconditioner is applied to a deep auto-encoder test case using different update strategies.
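The two ingredients of the abstract can be sketched with small dense NumPy stand-ins: a SPAI preconditioner built column by column over a fixed sparsity pattern (each column is an independent least-squares problem, which is what makes the construction embarrassingly parallel), plugged into a preconditioned CG loop that accesses the system matrix only through matrix-vector products. This is an illustrative sketch, not the authors' implementation: a 1-D Laplacian stands in for the Gauss-Newton curvature matrix, and the tridiagonal pattern and the names `spai` and `pcg` are assumptions for the example.

```python
import numpy as np

def spai(A, pattern):
    """Sparse approximate inverse: for each column k, minimize
    ||A m_k - e_k||_2 over the entries allowed by pattern[k].
    The columns are independent, so this loop parallelizes trivially."""
    n = A.shape[0]
    M = np.zeros((n, n))
    for k in range(n):
        J = pattern[k]                       # allowed nonzero rows of column k
        e_k = np.zeros(n)
        e_k[k] = 1.0
        m, *_ = np.linalg.lstsq(A[:, J], e_k, rcond=None)
        M[J, k] = m
    return M

def pcg(matvec, b, M, tol=1e-8, maxiter=500):
    """Preconditioned CG: the operator A enters only via matvec(v),
    mirroring the Hessian-vector products computed by backpropagation."""
    x = np.zeros_like(b)
    r = b - matvec(x)
    z = M @ r
    p = z.copy()
    rz = r @ z
    for it in range(maxiter):
        Ap = matvec(p)
        alpha = rz / (p @ Ap)
        x = x + alpha * p
        r = r - alpha * Ap
        if np.linalg.norm(r) < tol:
            return x, it + 1
        z = M @ r
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x, maxiter

# Small SPD test problem: a 1-D Laplacian as a stand-in curvature matrix.
n = 50
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b = np.ones(n)
# A tridiagonal pattern, mimicking a pattern derived from the matrix's structure.
pattern = [[j for j in (k - 1, k, k + 1) if 0 <= j < n] for k in range(n)]
M = spai(A, pattern)
M = 0.5 * (M + M.T)                          # symmetrize so CG theory applies
x, iters = pcg(lambda v: A @ v, b, M)
```

In the paper's setting the pattern is not chosen ad hoc as here but derived from the structure of the Gauss-Newton approximation, and the per-column least-squares problems are solved on the GPU.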

Similar Articles

Preconditioning the Pressure Tracking in Fluid Dynamics by Shape Hessian Information

Potential flow pressure matching is a classical inverse design aerodynamic problem. The resulting loss of regularity during the optimization poses challenges for shape optimization with normal perturbation of the surface mesh nodes. Smoothness is not enforced by the parameterization but by a proper choice of the scalar product based on the shape Hessian, which is derived in local coordinates fo...


Dynamic scaling based preconditioning for truncated Newton methods in large scale unconstrained optimization

This paper deals with the preconditioning of truncated Newton methods for the solution of large scale nonlinear unconstrained optimization problems. We focus on preconditioners which can be naturally embedded in the framework of truncated Newton methods, i.e. which can be built without storing the Hessian matrix of the function to be minimized, but only based upon information on the Hessian obt...


Towards Matrix-Free AD-Based Preconditioning of KKT Systems in PDE-Constrained Optimization

The presented approach aims at solving an equality constrained, finite-dimensional optimization problem, where the constraints arise from the discretization of some partial differential equation (PDE) on a given space grid. For this purpose, a stationary point of the Lagrangian is computed using Newton’s method, which requires the repeated solution of KKT systems. The proposed algorithm focuses...


A preconditioning technique for a class of PDE-constrained optimization problems

We investigate the use of a preconditioning technique for solving linear systems of saddle point type arising from the application of an inexact Gauss–Newton scheme to PDE-constrained optimization problems with a hyperbolic constraint. The preconditioner is of block triangular form and involves diagonal perturbations of the (approximate) Hessian to ensure nonsingularity and an approximate Schur...


An Efficient Dimer Method with Preconditioning and Linesearch

The dimer method is a Hessian-free algorithm for computing saddle points. We augment the method with a linesearch mechanism for automatic step size selection as well as preconditioning capabilities. We prove local linear convergence. A series of numerical tests demonstrate significant performance gains.



Journal:

Volume   Issue

Pages  -

Publication year: 2012